
* Match user zip codes with EPA PM2.5 monitors within 20 miles (and 30 miles) of the zip centroid

***inputs:
* $OrigData/Centroids.txt
* $Data/daily_PM2p5_1318.dta
* $Data/aqs_sites.dta
* $Data/UserZips.dta

***outputs:
* $Data/zip_centroids.dta
* $Data/aqs_sites_dailyid.dta
* $Data/aqs_dailylatlong.dta
* $Data/UserMonitorMatch.dta
* $Data/UserMonitorMatch30miles.dta


***1. IMPORT CENTROIDS OF ZIP CODES ***
clear
import delimited using "$OrigData/Centroids.txt"
drop fid
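
* Optional check (a sketch, not part of the original workflow): step 3 merges
* 1:1 on zip3, which assumes each zip code appears only once in the centroid
* file. isid exits with an error if zip3 does not uniquely identify the rows.
isid zip3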

save $Data/zip_centroids.dta, replace




***2. CREATE ID VAR FOR AQ SITES***

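* Unique monitoring sites that appear in the daily PM2.5 data
use "$Data/daily_PM2p5_1318.dta", clear
keep statecode countycode sitenumber
duplicates drop

* Attach site coordinates; keep only sites found in both files
merge 1:1 statecode countycode sitenumber using "$Data/aqs_sites.dta", keep(match) nogenerate

* Sequential monitor id; sort first so the ids are reproducible across runs
sort statecode countycode sitenumber
gen id = _n
di "Number of monitors: " _N

keep id latitude longitude statecode countycode sitenumber
save "$Data/aqs_sites_dailyid.dta", replace

* Coordinates only, used for the distance calculation in step 3
keep id latitude longitude
save "$Data/aqs_dailylatlong.dta", replace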







***3. CALCULATE DISTANCES FOR EACH USER ZIP CODE AND AQ SITE***



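	use "$Data/UserZips.dta", clear
	rename zip zip3

	* Merge with centroids; keep only user zip codes that have a centroid
	merge 1:1 zip3 using "$Data/zip_centroids.dta", keep(match) nogenerate

	* Pair every zip code with every monitor. The monitor count is read from
	* aqs_dailylatlong.dta (1192 when this file was written) so it cannot go stale.
	describe using "$Data/aqs_dailylatlong.dta", short
	local n_monitors = r(N)
	expand `n_monitors'
	by zip3, sort: gen id = _n

	* Every zip-monitor row must find its monitor coordinates
	merge m:1 id using "$Data/aqs_dailylatlong.dta", assert(match) nogenerate

	* Distance in miles between the zip centroid (lat, longi) and the monitor
	* (latitude, longitude); geodist is user-written (ssc install geodist)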
	geodist lat longi latitude longitude , generate(aq_dist) miles
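
	* Optional cross-check (a sketch, not part of the original workflow): recompute
	* the distance with the haversine (spherical) formula and compare it with
	* aq_dist. Assumptions: an Earth radius of 3958.8 miles; hav_dist and d2r are
	* illustrative names. Small gaps (well under 1%) are expected, because geodist
	* uses an ellipsoidal model unless its sphere option is given.
	local d2r = _pi/180
	gen double hav_dist = 2*3958.8*asin(sqrt( ///
		sin((latitude - lat)*`d2r'/2)^2 + ///
		cos(lat*`d2r')*cos(latitude*`d2r')*sin((longitude - longi)*`d2r'/2)^2))
	summarize aq_dist hav_dist
	drop hav_dist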
	
	
	
	
	
	
	
	* Keep monitors within 30 miles of each zip code
	keep if aq_dist <= 30
	keep zip3 aq_dist id
	rename zip3 zip
	sort zip aq_dist
	save "$Data/UserMonitorMatch30miles.dta", replace

	* Restrict the 30-mile match to monitors within 20 miles
	keep if aq_dist <= 20
	
	
	save $Data/UserMonitorMatch.dta, replace
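
	* Optional summary (a sketch, not part of the original workflow): distance to
	* the nearest monitor for each zip code in the 20-mile match. preserve/restore
	* leaves the data in memory unchanged.
	preserve
	bysort zip (aq_dist): keep if _n == 1
	summarize aq_dist, detail
	restore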
	**
	





